precision matrix estimation
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.67)
- Banking & Finance (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)
- (3 more...)
Non-Asymptotic Analysis of Data Augmentation for Precision Matrix Estimation
Morisset, Lucas, Hardy, Adrien, Durmus, Alain
This paper addresses the problem of inverse covariance (also known as precision matrix) estimation in high-dimensional settings. Specifically, we focus on two classes of estimators: linear shrinkage estimators with a target proportional to the identity matrix, and estimators derived from data augmentation (DA). Here, DA refers to the common practice of enriching a dataset with artificial samples, typically generated via a generative model or through random transformations of the original data, prior to model fitting. For both classes, we derive estimators and provide concentration bounds on their quadratic error. This allows for both method comparison and hyperparameter tuning, such as selecting the optimal proportion of artificial samples. On the technical side, our analysis relies on tools from random matrix theory. We introduce a novel deterministic equivalent for generalized resolvent matrices, accommodating dependent samples with a specific structure. We support our theoretical results with numerical experiments.
- Europe > Switzerland > Zürich > Zürich (0.04)
- North America > United States > Virginia (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (3 more...)
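A minimal sketch of the two estimator classes described in the abstract above, assuming Gaussian-like data and a simple jitter-based augmentation (the paper does not tie DA to any particular generative mechanism); the function names, the shrinkage weight rho, and the noise scale are illustrative placeholders, not the authors' notation:

```python
import numpy as np

def shrinkage_precision(X, rho):
    """Linear shrinkage toward a target proportional to the identity,
    followed by inversion: ((1 - rho) * S + rho * mu * I)^{-1}."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)      # sample covariance
    mu = np.trace(S) / p             # scale of the identity target
    return np.linalg.inv((1 - rho) * S + rho * mu * np.eye(p))

def augmented_precision(X, n_artificial, noise_scale=0.1, seed=0):
    """Data-augmentation estimator: enrich X with artificial samples
    (here, jittered copies of real ones) and invert the sample
    covariance of the augmented dataset."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, X.shape[0], size=n_artificial)
    X_art = X[idx] + noise_scale * rng.standard_normal((n_artificial, X.shape[1]))
    return np.linalg.inv(np.cov(np.vstack([X, X_art]), rowvar=False))
```

Both the shrinkage weight rho and the proportion of artificial samples n_artificial / n play the role of the hyperparameters whose tuning the paper's concentration bounds are meant to guide.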
A General Class of Model-Free Dense Precision Matrix Estimators
Caner, Mehmet, Capponi, Agostino, Stojnic, Mihailo
We introduce prototype consistent model-free, dense precision matrix estimators that have broad application in economics. Using quadratic form concentration inequalities and novel algebraic characterizations of confounding dimension reductions, we are able to: (i) obtain non-asymptotic bounds for precision matrix estimation errors and also (ii) consistency in high dimensions; (iii) uncover the existence of an intrinsic signal-to-noise -- underlying dimensions tradeoff; and (iv) avoid exact population sparsity assumptions. In addition to its desirable theoretical properties, a thorough empirical study of the S&P 500 index shows that a tuning parameter-free special case of our general estimator exhibits a doubly ascending Sharpe Ratio pattern, thereby establishing a link with the famous double descent phenomenon dominantly present in recent statistical and machine learning literature.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > North Carolina (0.04)
- Asia > Singapore (0.04)
- Asia > Malaysia (0.04)
- Banking & Finance > Trading (1.00)
- Banking & Finance > Economy (0.92)
- Government > Regional Government > North America Government > United States Government (0.45)
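The connection between precision matrix quality and Sharpe ratios in the S&P 500 study can be illustrated with the standard global minimum-variance portfolio, whose weights depend on the covariance matrix only through its inverse; this is a generic illustration of why precision estimates drive portfolio performance, not the authors' estimator:

```python
import numpy as np

def gmv_weights(precision):
    """Global minimum-variance weights: w = (Omega @ 1) / (1' Omega 1)."""
    ones = np.ones(precision.shape[0])
    w = precision @ ones
    return w / (ones @ w)

def gmv_sharpe(returns, precision, periods_per_year=252):
    """Annualized Sharpe ratio (excess return over zero) of the GMV
    portfolio implied by a precision estimate, on a T x p returns panel."""
    port = returns @ gmv_weights(precision)
    return np.sqrt(periods_per_year) * port.mean() / port.std(ddof=1)
```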
Trans-Glasso: A Transfer Learning Approach to Precision Matrix Estimation
Zhao, Boxin, Ma, Cong, Kolar, Mladen
Precision matrix estimation is essential in various fields, yet it is challenging when samples for the target study are limited. Transfer learning can enhance estimation accuracy by leveraging data from related source studies. We propose Trans-Glasso, a two-step transfer learning method for precision matrix estimation. First, we obtain initial estimators using a multi-task learning objective that captures shared and unique features across studies. Then, we refine these estimators through differential network estimation to adjust for structural differences between the target and source precision matrices. Under the assumption that most entries of the target precision matrix are shared with source matrices, we derive non-asymptotic error bounds and show that Trans-Glasso achieves minimax optimality under certain conditions. Extensive simulations demonstrate Trans-Glasso's superior performance compared to baseline methods, particularly in small-sample settings. We further validate Trans-Glasso in applications to gene networks across brain tissues and protein networks for various cancer subtypes, showcasing its effectiveness in biological contexts. Additionally, we derive the minimax optimal rate for differential network estimation, representing the first such guarantee in this area.
- North America > United States > California (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Hematology (0.92)
- Health & Medicine > Therapeutic Area > Oncology (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)
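A rough two-step sketch in the spirit of the pipeline described above, using scikit-learn's graphical lasso as a stand-in for both stages; pooling source and target samples and soft-thresholding the difference are deliberate simplifications, not the actual Trans-Glasso objectives, and nothing here enforces positive definiteness of the corrected estimate:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def soft_threshold(A, t):
    """Entrywise soft-thresholding used to sparsify the target-source difference."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def two_step_transfer_precision(X_target, X_sources, alpha=0.05, thresh=0.05):
    # Step 1: initial estimate from pooled target and source samples,
    # a crude proxy for the multi-task objective that captures shared structure.
    X_pool = np.vstack([X_target] + list(X_sources))
    Omega_init = GraphicalLasso(alpha=alpha).fit(X_pool).precision_

    # Step 2: estimate a sparse correction (differential network) from the
    # target data alone and add it back to the shared initial estimate.
    Omega_target = GraphicalLasso(alpha=alpha).fit(X_target).precision_
    delta = soft_threshold(Omega_target - Omega_init, thresh)
    return Omega_init + delta
```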
Distributed Networked Multi-task Learning
Hong, Lingzhou, Garcia, Alfredo
We consider a distributed multi-task learning scheme that accounts for multiple linear model estimation tasks with heterogeneous and/or correlated data streams. We assume that nodes can be partitioned into groups corresponding to different learning tasks and communicate according to a directed network topology. Each node estimates a linear model asynchronously and is subject to local (within-group) and global (across-group) regularization terms targeting noise reduction and generalization performance improvement, respectively. We provide a finite-time characterization of convergence of the estimators and the task relation, and illustrate the scheme's general applicability in two examples: random field temperature estimation and modeling student performance from different academic districts.
Index Terms: Multi-task Learning, Distributed Optimization, Network-based computing systems, Multi-agent systems.
In the current age of big data, many applications face the challenge of processing large and complex datasets, which are usually not available in a single place but rather distributed across multiple locations. Approaches that require data to be aggregated in a central location may be subject to significant scalability and storage challenges. In other scenarios, data are scattered across different sites and owned by different individuals or organizations, and data privacy and security requirements make it difficult to merge such data easily. In both contexts, Distributed Learning (DL) [1]-[3] can provide feasible solutions by building high-performance models shared among multiple nodes while maintaining user privacy and data confidentiality. DL aims to build a collective machine learning model based on the data from multiple computing nodes that can process and store data and are connected via networks. Nodes can utilize neighboring information to improve their own performance: rather than sharing raw data, they exchange only model information such as model parameters or gradients, avoiding the disclosure of sensitive information.
This work was supported in part by the National Science Foundation under Award ECCS-1933878 and in part by the Air Force Office of Scientific Research under Grant 15RT0767. Lingzhou Hong and Alfredo Garcia are with the Department of Industrial & Systems Engineering, Texas A&M University, College Station, TX 77843 USA.
- Information Technology > Security & Privacy (1.00)
- Education > Assessment & Standards > Student Performance (0.34)
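A simplified, synchronous sketch of the kind of node update the abstract describes: each node takes a least-squares gradient step on its own stream and is pulled toward its within-group neighbors (local regularization) and toward the network-wide average (global regularization). The step size and penalty weights are placeholders, and the actual scheme operates asynchronously over a directed topology:

```python
import numpy as np

def node_update(theta, X, y, group_thetas, all_thetas,
                eta=0.01, lam_local=0.1, lam_global=0.01):
    """One update of a node's linear-model estimate theta.

    X, y         : the node's current data batch
    group_thetas : estimates of nodes in the same task group
    all_thetas   : estimates of all nodes in the network
    """
    grad_fit = X.T @ (X @ theta - y) / len(y)               # local least-squares gradient
    pull_local = theta - np.mean(group_thetas, axis=0)      # stay close within the group
    pull_global = theta - np.mean(all_thetas, axis=0)       # weaker pull across groups
    return theta - eta * (grad_fit + lam_local * pull_local + lam_global * pull_global)
```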
Concentration of a sparse Bayesian model with Horseshoe prior in estimating high-dimensional precision matrix
Precision matrices are crucial in many fields such as social networks, neuroscience, and economics, representing the edge structure of Gaussian graphical models (GGMs), where a zero in an off-diagonal position of the precision matrix indicates conditional independence between nodes. In high-dimensional settings where the dimension of the precision matrix $p$ exceeds the sample size $n$ and the matrix is sparse, methods like graphical Lasso, graphical SCAD, and CLIME are popular for estimating GGMs. While frequentist methods are well-studied, Bayesian approaches for (unstructured) sparse precision matrices are less explored. The graphical horseshoe estimate of Li et al. (2019), applying the global-local horseshoe prior, shows superior empirical performance, but theoretical work for sparse precision matrix estimation using shrinkage priors is limited. This paper addresses these gaps by providing concentration results for the tempered posterior with the fully specified horseshoe prior in high-dimensional settings. Moreover, we also provide novel theoretical results for model misspecification, offering a general oracle inequality for the posterior.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
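The GGM reading of the precision matrix mentioned above can be made concrete: a zero off-diagonal entry of Omega encodes conditional independence of the corresponding pair of nodes given all others, and the implied partial correlation is -Omega_ij / sqrt(Omega_ii * Omega_jj). A small sketch (the horseshoe-prior machinery itself is not reproduced here):

```python
import numpy as np

def partial_correlations(Omega):
    """Partial correlation matrix implied by a precision matrix Omega:
    rho_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj)."""
    d = np.sqrt(np.diag(Omega))
    P = -Omega / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P

def ggm_edges(Omega, tol=1e-8):
    """Edges of the Gaussian graphical model: pairs (i, j), i < j,
    whose precision entry is numerically nonzero."""
    p = Omega.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(Omega[i, j]) > tol]
```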
Schur's Positive-Definite Network: Deep Learning in the SPD cone with structure
Pouliquen, Can, Massias, Mathurin, Vayer, Titouan
Estimating matrices in the symmetric positive-definite (SPD) cone is of interest for many applications ranging from computer vision to graph learning. While there exist various convex optimization-based estimators, they remain limited in expressivity due to their model-based approach. The success of deep learning has thus led many to use neural networks to learn to estimate SPD matrices in a data-driven fashion. For learning structured outputs, one promising strategy involves architectures designed by unrolling iterative algorithms, which potentially benefit from inductive bias properties. However, designing correct unrolled architectures for SPD learning is difficult: they either do not guarantee that their output has all the desired properties, rely on heavy computations, or are overly restricted to specific matrices, which hinders their expressivity. In this paper, we propose SpodNet, a novel and generic learning module with guaranteed SPD outputs, which also enables learning a larger class of functions than existing approaches. Notably, it solves the challenging task of jointly learning SPD and sparse matrices. Our experiments demonstrate the versatility of SpodNet layers.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)
- North America > United States > Minnesota (0.04)
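For context on what "guaranteed SPD outputs" means in practice, one generic device (not the SpodNet column-by-column construction) is to let a network emit an unconstrained vector, reshape it into a lower-triangular factor L, and return L L^T + eps * I, which is SPD by construction; note that this simple parameterization does not by itself yield the joint SPD-and-sparse structure SpodNet targets:

```python
import numpy as np

def spd_from_raw(raw, p, eps=1e-4):
    """Map an unconstrained vector of length p * (p + 1) // 2 (e.g. a
    network output) to a symmetric positive-definite p x p matrix."""
    L = np.zeros((p, p))
    L[np.tril_indices(p)] = raw
    diag = np.diag_indices(p)
    L[diag] = np.log1p(np.exp(L[diag]))   # softplus keeps the factor well conditioned
    return L @ L.T + eps * np.eye(p)      # PSD plus eps * I, hence SPD
```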
Large Scale Distributed Sparse Precision Estimation
We consider the problem of sparse precision matrix estimation in high dimensions using the CLIME estimator, which has several desirable theoretical properties. We present an inexact alternating direction method of multipliers (ADMM) algorithm for CLIME, and establish rates of convergence for both the objective and the optimality conditions. Further, we develop a large-scale distributed framework for the computations, which scales to millions of dimensions and trillions of parameters using hundreds of cores. The proposed framework solves CLIME in column blocks and involves only elementwise operations and parallel matrix multiplications. We evaluate our algorithm on both shared-memory and distributed-memory architectures, which can use block-cyclic distribution of data and parameters to achieve load balance and improve the efficiency in the use of memory hierarchies. Experimental results show that our algorithm is substantially more scalable than state-of-the-art methods and scales almost linearly with the number of cores.
- North America > United States > Minnesota (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
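For orientation, the CLIME estimator solved by the framework above is the standard constrained l1 program, with S_n the sample covariance; the constraint acts column by column, so the problem splits into p independent vector programs, which is exactly what the column-block distributed solver exploits (a symmetrization step follows, e.g. keeping the smaller-magnitude of each (i, j)/(j, i) pair):

```latex
% CLIME (S_n: sample covariance):
\widehat{\Omega} = \arg\min_{\Omega \in \mathbb{R}^{p \times p}} \|\Omega\|_{1}
  \quad \text{subject to} \quad \|S_n \Omega - I_p\|_{\infty} \le \lambda,
% which decouples across the columns of Omega:
\widehat{\omega}_j = \arg\min_{\omega \in \mathbb{R}^{p}} \|\omega\|_{1}
  \quad \text{subject to} \quad \|S_n \omega - e_j\|_{\infty} \le \lambda,
  \qquad j = 1, \dots, p.
```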
Learning linear non-Gaussian directed acyclic graph with diverging number of nodes
Zhao, Ruixuan, He, Xin, Wang, Junhui
The acyclic model, often depicted as a directed acyclic graph (DAG), has been widely employed to represent directional causal relations among collected nodes. In this article, we propose an efficient method to learn linear non-Gaussian DAGs in high-dimensional cases, where the noise can follow any continuous non-Gaussian distribution. This is in sharp contrast to most existing DAG learning methods, which assume Gaussian noise with additional variance assumptions to attain exact DAG recovery. The proposed method leverages a novel concept of topological layers to facilitate DAG learning. In particular, we show that the topological layers can be exactly reconstructed in a bottom-up fashion, and that the parent-child relations among nodes in each layer can also be consistently established. More importantly, the proposed method does not require the faithfulness or parental-faithfulness assumption that has been widely adopted in the DAG learning literature. Its advantage is also supported by numerical comparisons against several popular competitors in various simulated examples, as well as a real application to the global spread of COVID-19.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Asia > China > Hong Kong > Kowloon (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- (5 more...)
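To make the "topological layer" notion concrete: given a DAG, its layers can be read off by repeatedly peeling leaf nodes (nodes with no children), which is the bottom-up order the method reconstructs; the sketch below assumes the adjacency matrix is known and does not reproduce the paper's data-driven identification step:

```python
import numpy as np

def topological_layers(adj):
    """Peel a DAG into layers bottom-up: layer 0 holds the leaf nodes
    (no children), layer 1 the nodes whose children all lie in layer 0,
    and so on. adj[i, j] = 1 encodes an edge i -> j."""
    adj = adj.astype(bool)
    remaining = set(range(adj.shape[0]))
    layers = []
    while remaining:
        cols = list(remaining)
        leaves = [i for i in remaining if not adj[i, cols].any()]
        if not leaves:                       # a cycle: not a DAG
            raise ValueError("graph contains a cycle")
        layers.append(sorted(leaves))
        remaining -= set(leaves)
    return layers
```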